The Filtering Approaches for the Improved Boyer-Moore Approximate String Matching
Abstract
The Boyer-Moore algorithm solves the exact string matching problem. Here, the Bad Character Rule of the Boyer-Moore algorithm is extended to solve approximate string matching. Tarhio and Ukkonen introduced a basic algorithm of this kind, but it is closer to the Horspool algorithm. We use the idea behind their algorithm to implement the Bad Character Rule and obtain a new shift length, so that when the window must be shifted in the filtering stage, it has a chance to shift farther. This paper also explains two simple filtering approaches, either of which can easily be combined with our algorithm. Both filtering rules are easy to understand: one follows directly from the definition of edit distance, and the other uses a special relationship between edit distance and Hamming distance.
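For orientation, the classical Horspool form of the bad-character shift for exact matching can be sketched as follows. This is only the well-known exact-matching baseline, not the approximate algorithm or the new shift length described in the abstract; the function names `horspool_shift_table` and `horspool_search` are hypothetical and chosen for illustration.

```python
def horspool_shift_table(pattern):
    """Bad-character shifts in the Horspool style (illustrative helper).

    For each character in pattern[:-1], store the distance from its
    rightmost occurrence to the last position of the pattern.
    """
    m = len(pattern)
    # Later (rightmost) occurrences overwrite earlier ones,
    # so each character ends up with its smallest safe shift.
    return {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}


def horspool_search(text, pattern):
    """Return the start positions of all exact occurrences of pattern in text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []  # assumption: an empty pattern yields no matches
    shift = horspool_shift_table(pattern)
    hits = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            hits.append(pos)
        # Shift by the entry for the text character aligned with the
        # last pattern position; unseen characters allow a full shift of m.
        pos += shift.get(text[pos + m - 1], m)
    return hits


print(horspool_search("abracadabra", "abra"))
```

The shift depends only on the text character under the last position of the window, which is what distinguishes the Horspool variant from the original Bad Character Rule, where the shift is computed from the mismatching position.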
Similar resources
Approximate Boyer-Moore String Matching
The Boyer-Moore idea applied in exact string matching is generalized to approximate string matching. Two versions of the problem are considered. The k mismatches problem is to find all approximate occurrences of a pattern string (length m) in a text string (length n) with at most k mismatches. Our generalized Boyer-Moore algorithm is shown (under a mild independence assumption) to solve the pro...
String Matching in the DNA Alphabet
Searching for occurrences of string patterns is a common problem in many applications. Various good solutions have been presented for string matching. The most efficient solutions in practice are based on the Boyer–Moore algorithm. A typical question in molecular biology is whether a given sequence has appeared elsewhere. In the following, we will concentrate on searching for exact occurrences...
Enhanced Pattern Matching Performance Using Improved Boyer Moore Horspool Algorithm
In computer science, the Boyer–Moore–Horspool algorithm is an algorithm for finding substrings in strings. Pattern matching solutions can be classified as software-based or hardware-based according to how they are implemented. It is important to enhance pattern matching performance. This paper proposes enhancing pattern matching performance using an improved Boyer–Moore–Horspool algorithm. It combines the determinis...
Approximate String Matching with Reduced Alphabet
We present a method to speed up approximate string matching by mapping the factual alphabet to a smaller alphabet. We apply the alphabet reduction scheme to a tuned version of the approximate Boyer–Moore algorithm utilizing the Four-Russians technique. Our experiments show that the alphabet reduction makes the algorithm faster. Especially in the k-mismatch case, the new variation is faster tha...
String Matching Rules Used by Variants of Boyer-moore Algorithm
The string matching problem is widely studied in computer science, mainly because of its many applications in various fields. Many string matching algorithms have therefore been proposed. Boyer-Moore is the most popular of them, and hence the largest number of variants have been derived from the Boyer-Moore (BM) algorithm. This paper addresses the variant of Boyer-Moore algorithm for finding the occurrences of...